Presenter: HMW. Category: graphical models. Preference: Oral.
Evaluation Methods for Topic Models
Abstract
Statistical topic modeling has become a popular tool for analyzing large, unstructured text collections. There is a significant body of work developing sophisticated topic models and their applications. To date, however, the task of evaluating topic models has not been specifically addressed. Evaluation is an important issue: the unsupervised nature of topic models makes model selection difficult. For some applications there may be extrinsic tasks, such as information retrieval or document classification, for which performance can be evaluated. More universally, however, a topic model’s ability to generalize can be measured by computing the probability of held-out documents under the model, which is independent of any specific application. Here, we consider the simplest topic model, latent Dirichlet allocation (LDA) [? ], and compare a number of methods for estimating the probability of held-out documents given a trained model. Our empirical results on synthetic and real-world data sets show that the estimators currently used in the topic modeling literature are much less accurate and have higher variance than two proposed alternative methods. These proposed alternatives are also applicable to more complicated topic models.
Similar resources
Presenter: HMW Category: graphical models Preference: Oral Polylingual Topic Models
Statistical topic models are a useful tool for analyzing large, unstructured document collections [1, 2]. Such collections are increasingly available in multiple languages. Previous work on bilingual topic modeling [4] has focused on aligning pairs of translated sentences. In contrast, we consider “loosely parallel” corpora, in which tuples of documents in different languages are not direct tra...
Graphical models - methods for data analysis and mining
The best ebooks about Graphical Models Methods For Data Analysis And Mining that you can get for free here by download this Graphical Models Methods For Data Analysis And Mining and save to your desktop. This ebooks is under topic such as data mining with graphical models pdfsmanticscholar data mining with graphical models borgelt data mining with graphical models springer data mining with poss...
The evaluation of Cox and Weibull proportional hazards models and their applications to identify factors influencing survival time in acute leukem
Introduction: The most important models used in the analysis of survival data are proportional hazards models. Applying these models requires verifying the proportional hazards assumption; otherwise, it would lead to incorrect inference. This study aims to evaluate the Cox and Weibull models, which are used to identify factors affecting survival time in acute leukemia. Me...
New Applications on Linguistic Mathematical Structures and Stability Analysis of Linguistic Fuzzy Models
In this paper, some algebraic structures for linguistic fuzzy models are defined for the first time. By defining a linguistic fuzzy norm, the stability of these systems can be considered. Two methods (norm-based and graphical-based) for stability analysis of linguistic fuzzy systems are presented. Next, a new simple method for calculations with linguistic fuzzy numbers is defined. At the end tw...
A Review Of DEA Approaches For Health Supply Chain
More than 200 papers have been published in the last 20 years on the topic of health supply chains (HSC). Looking at the research methodologies employed, less than 15 papers apply data envelopment analysis (DEA) models. This is in contrast to, for example, A Network Data Envelopment Analysis (NDEA) Model for Supply Chain Performance Evaluation where several reviews on respective NDEA models hav...
Publication date: 2009